An Empirical Comparison between Global and Greedy-like Search for Feature Selection

Authors

  • Ibrahim F. Imam
  • Haleh Vafaie
Abstract

The paper presents a comparison between two feature selection methods: the Importance Score (IS) method and a genetic algorithm-based (GA) method. The goal of both is to improve the performance of the rules produced by the AQ15 learning system. The IS method performs a greedy-like search based on an attributional score that represents the importance of each attribute in classifying the decision classes. IS uses the rule testing system Atest to evaluate the performance of the selected feature sets. The GA method efficiently explores the space of all possible feature subsets to find the set of features that maximizes the predictive accuracy of the learned rules. It uses the GENESIS system to search the space globally, and an evaluation function to provide feedback about the fitness of each feature subset. The comparison is carried out on three real-world problems: wind bracing design, accident data, and soybean data.
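The contrast between the two search strategies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: AQ15, Atest, and GENESIS are not used, and every name here (`make_toy_data`, `importance_scores`, `accuracy`, `greedy_like_search`, `genetic_search`) is hypothetical. As stand-ins, the sketch assumes a leave-one-out 1-nearest-neighbour accuracy as the evaluation function, a class-mean difference as the importance score, and a small synthetic data set in which only feature 0 is informative.

```python
import random

def make_toy_data(n=40, n_features=4, seed=1):
    """Synthetic two-class data: feature 0 separates the classes, the rest is noise."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    data = [[y + rng.uniform(-0.2, 0.2)] +
            [rng.uniform(0.0, 0.5) for _ in range(n_features - 1)]
            for y in labels]
    return data, labels

def importance_scores(data, labels):
    """Assumed stand-in score: absolute difference between the two class means."""
    scores = []
    for f in range(len(data[0])):
        m0 = [r[f] for r, y in zip(data, labels) if y == 0]
        m1 = [r[f] for r, y in zip(data, labels) if y == 1]
        scores.append(abs(sum(m1) / len(m1) - sum(m0) / len(m0)))
    return scores

def accuracy(data, labels, subset):
    """Leave-one-out 1-NN accuracy on `subset` (stand-in for learning and testing rules)."""
    if not subset:
        return 0.0
    correct = 0
    for i, row in enumerate(data):
        nearest = min((j for j in range(len(data)) if j != i),
                      key=lambda j: sum((row[f] - data[j][f]) ** 2 for f in subset))
        correct += labels[nearest] == labels[i]
    return correct / len(data)

def greedy_like_search(data, labels):
    """Rank features by score, then evaluate nested prefixes of the ranking."""
    scores = importance_scores(data, labels)
    ranked = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
    best_subset, best_acc = [], -1.0
    for k in range(1, len(ranked) + 1):
        acc = accuracy(data, labels, ranked[:k])
        if acc > best_acc:  # keep the smallest prefix that reaches the best accuracy
            best_subset, best_acc = ranked[:k], acc
    return sorted(best_subset), best_acc

def genetic_search(data, labels, pop_size=12, generations=15, seed=0):
    """Simple GA over bit masks: truncation selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n_features = len(data[0])

    def fitness(mask):
        return accuracy(data, labels, [f for f, bit in enumerate(mask) if bit])

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0][:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:
                child[rng.randrange(n_features)] ^= 1
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    return [f for f, bit in enumerate(best) if bit], fitness(best)

if __name__ == "__main__":
    data, labels = make_toy_data()
    print("greedy-like:", greedy_like_search(data, labels))
    print("genetic:    ", genetic_search(data, labels))
```

The greedy-like search evaluates only as many subsets as there are features, while the genetic search samples the full space of bit masks and so depends on its random seed and population settings. On real data the two can disagree, which is the kind of behaviour the paper's comparison examines.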

Similar resources

Feature Selection Methods: Genetic Algorithms vs. Greedy-like Search

This paper presents a comparison between two feature selection methods, the Importance Score (IS) which is based on a greedy-like search and a genetic algorithm-based (GA) method, in order to better understand their strengths and limitations and their area of application. The results of our experiments show a very strong relation between the nature of the data and the behavior of both systems. ...

Determining Optimal Support Vector Machines for Hyperspectral Image Classification Based on a Genetic Algorithm

Hyperspectral remote sensing imagery, owing to its rich spectral information, provides an efficient tool for ground classification in complex geographical areas with similar classes. Given the robustness of Support Vector Machines (SVMs) in high-dimensional spaces, they are an efficient tool for classifying hyperspectral imagery. However, there are two optimization issues which s...

Effective Wrapper-Filter hybridization through GRASP Schemata

Of all of the challenges which face the selection of relevant features for predictive data mining or pattern recognition modeling, the adaptation of computational intelligence techniques to feature selection problem requirements is one of the primary impediments. A new improved metaheuristic based on Greedy Randomized Adaptive Search Procedure (GRASP) is proposed for the problem of Feature Sele...

Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines

In this paper, the principles of feature selection and existing feature selection methods for classifying and clustering data are introduced. To that end, categorizing frameworks for finding selected subsets, namely search-based and non-search-based procedures, as well as evaluation criteria and data mining tasks, are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...

Publication date: 2001